 adjusted rand index


Toward Interpretable Evaluation Measures for Time Series Segmentation

Neural Information Processing Systems

Time series segmentation is a fundamental task in analyzing temporal data across various domains, from human activity recognition to energy monitoring. While numerous state-of-the-art methods have been developed to tackle this problem, the evaluation of their performance remains critically limited. Existing measures predominantly focus on change point accuracy or rely on point-based measures such as the Adjusted Rand Index (ARI), which fail to capture the quality of the detected segments, ignore the nature of errors, and offer limited interpretability. In this paper, we address these shortcomings by introducing two novel evaluation measures: WARI (Weighted Adjusted Rand Index), which accounts for the position of segmentation errors, and SMS (State Matching Score), a fine-grained measure that identifies and scores four fundamental types of segmentation errors while allowing error-specific weighting. We empirically validate WARI and SMS on synthetic and real-world benchmarks, showing that they not only provide a more accurate assessment of segmentation quality but also uncover insights, such as error provenance and type, that are inaccessible with traditional measures.
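The claim that a point-based measure such as ARI ignores where errors occur can be illustrated with a small sketch. The code below computes the standard ARI from a contingency table in plain Python and applies it to two hypothetical predictions on a 20-point series: one with a shifted change point, one with a spurious segment in the middle of a state. The toy labels and the function name are assumptions made for illustration; this is not WARI or SMS, whose definitions are not given in the abstract.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Standard ARI computed from pair counts of the contingency table."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("label sequences must have the same length")
    n = len(labels_true)
    if n < 2:
        return 1.0  # no pairs to compare; treat as perfect agreement

    cells = Counter(zip(labels_true, labels_pred))
    rows = Counter(labels_true)
    cols = Counter(labels_pred)

    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in rows.values())
    sum_cols = sum(comb(c, 2) for c in cols.values())

    expected = sum_rows * sum_cols / comb(n, 2)
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. a single cluster on both sides
    return (sum_cells - expected) / (max_index - expected)


# Hypothetical ground truth: state 0 for 10 points, then state 1 for 10 points.
truth = [0] * 10 + [1] * 10

# Error A: the change point is detected two samples late.
late_boundary = [0] * 12 + [1] * 8

# Error B: a spurious two-sample segment appears inside the first state.
spurious_segment = [0] * 3 + [1] * 2 + [0] * 5 + [1] * 10

ari_late = adjusted_rand_index(truth, late_boundary)
ari_spurious = adjusted_rand_index(truth, spurious_segment)
print(ari_late, ari_spurious)
```

Both predictions mislabel two points and yield the same contingency counts, so ARI assigns them the same score (about 0.62) even though one is a small boundary delay and the other introduces an extra segment.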





Appendices: A Additional Experiments

Neural Information Processing Systems

Table 6: Results of selected models on Task 1 (Grouping) using contextual embeddings. In this section, we provide additional t-SNE projections of embeddings from various methods used. Figure 7: Solved wall for Task 1 (Grouping) using GloVe. Left: ("Suspension" is "a term used in musical harmony" in this context. Grief" in the embedding space, which matches the "Good ___!" connection. Figure 8: Solved wall for Task 1 (Grouping) using FastText (Crawl). Left: contextual embedding solved 3/4 groups. Here the clue "Rembrandt" is placed near other Dutch painters. Right: static embedding solved 0/4 groups. The following section provides answers to questions listed in datasheets for datasets. For what purpose was the dataset created? Was there a specific task in mind? Who created this dataset (e.g., which team, research group) and on behalf of which entity (e.g., The dataset has been collectively curated by the authors of this paper. What support was needed to make this dataset?



A Hybrid Computational Intelligence Framework for scRNA-seq Imputation: Integrating scRecover and Random Forests

arXiv.org Artificial Intelligence

Single-cell RNA sequencing (scRNA-seq) enables transcriptomic profiling at cellular resolution but suffers from pervasive dropout events that obscure biological signals. We present SCR-MF, a modular two-stage workflow that combines principled dropout detection using scRecover with robust non-parametric imputation via missForest. Across public and simulated datasets, SCR-MF achieves robust and interpretable performance comparable to or exceeding existing imputation methods in most cases, while preserving biological fidelity and transparency. Runtime analysis demonstrates that SCR-MF provides a competitive balance between accuracy and computational efficiency, making it suitable for mid-scale single-cell datasets.


Oh That Looks Familiar: A Novel Similarity Measure for Spreadsheet Template Discovery

arXiv.org Artificial Intelligence

Traditional methods for identifying structurally similar spreadsheets fail to capture the spatial layouts and type patterns defining templates. To quantify spreadsheet similarity, we introduce a hybrid distance metric that combines semantic embeddings, data type information, and spatial positioning. Our method converts spreadsheets into cell-level embeddings and then aggregates them using techniques such as Chamfer and Hausdorff distances. Experiments across template families demonstrate superior unsupervised clustering performance compared to the graph-based Mondrian baseline, achieving perfect template reconstruction (Adjusted Rand Index of 1.00 versus 0.90) on the FUSTE dataset. Our approach facilitates large-scale automated template discovery, which in turn enables downstream applications such as retrieval-augmented generation over tabular collections, model training, and bulk data cleaning.
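As a rough illustration of how Chamfer and Hausdorff distances aggregate per-cell vectors into a single sheet-to-sheet distance, here is a minimal sketch in plain Python. It assumes Euclidean distance between cell vectors and uses one common symmetric Chamfer convention (the sum of the two mean nearest-neighbour distances); the paper's actual hybrid metric, embeddings, and weighting are not described in the abstract, and the toy vectors below are hypothetical.

```python
from math import dist


def _nearest(point, points):
    """Distance from one vector to its nearest neighbour in a set."""
    return min(dist(point, other) for other in points)


def chamfer_distance(a, b):
    """Symmetric Chamfer distance: sum of mean nearest-neighbour distances."""
    if not a or not b:
        raise ValueError("both sets must be non-empty")
    a_to_b = sum(_nearest(p, b) for p in a) / len(a)
    b_to_a = sum(_nearest(q, a) for q in b) / len(b)
    return a_to_b + b_to_a


def hausdorff_distance(a, b):
    """Symmetric Hausdorff distance: worst-case nearest-neighbour distance."""
    if not a or not b:
        raise ValueError("both sets must be non-empty")
    a_to_b = max(_nearest(p, b) for p in a)
    b_to_a = max(_nearest(q, a) for q in b)
    return max(a_to_b, b_to_a)


# Hypothetical 2-D "cell embeddings" for two sheets; sheet_b has one extra,
# far-away cell that has no close counterpart in sheet_a.
sheet_a = [(0.0, 0.0), (1.0, 0.0)]
sheet_b = [(0.0, 0.0), (1.0, 0.0), (4.0, 0.0)]

chamfer = chamfer_distance(sheet_a, sheet_b)
hausdorff = hausdorff_distance(sheet_a, sheet_b)
print(chamfer, hausdorff)  # 1.0 3.0
```

The single outlying cell dominates the Hausdorff distance, while the Chamfer distance averages it over all cells, which is one reason the two aggregations can rank sheet pairs differently.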


Toward Interpretable Evaluation Measures for Time Series Segmentation

arXiv.org Artificial Intelligence
